Efficient extraction of schemas for XML documents

Authors

  • Jun-Ki Min
  • Jae-Yong Ahn
  • Chin-Wan Chung
Abstract

In this paper, we present a technique for efficient extraction of concise and accurate schemas for XML documents. By restricting the schema form and applying some heuristic rules, we achieve both efficiency and conciseness. The results of an experiment with real-life DTDs show that our approach attains high accuracy and is 20 to 200 times faster than existing approaches. © 2002 Elsevier Science B.V. All rights reserved.

Similar resources

Mapping DTDs to relational schemas with semantic constraints

XML is becoming a prevalent format and standard for data exchange in many applications. With the increase of XML data, there is an urgent need to research some efficient methods to store and manage XML data. As relational databases are the primary choices for this purpose considering their data management power, it is necessary to research the problem of mapping XML schemas to relational schema...

An Efficient Data Extraction and Storage Utility For XML Documents

In this paper, a mechanism to provide selective extraction of data objects from XML documents, the storage of these documents in an object-relational database, and retrieval and reconstruction of XML documents from extracted data objects is discussed. The motivation is provided by a need for a Workflow Process Repository in a Workflow Management System (WFMS) [6], namely METEOR WFMS, to store m...

Inference Document Type (DTD) From XML Document: Web Structure Mining

XML is becoming a prevalent format and de facto standard for data exchange in many applications. Traditionally, however, large amounts of data are stored and managed in relational databases. There is an urgent need to research efficient methods to convert the data stored in relational databases to XML format when integrating and exchanging these data. The semantics of XML schemas are cr...

A Generic Load/Extract Utility for Data Transfer between XML Documents and Relational Databases

XML is rapidly gaining momentum in e-commerce and Internet-based information exchange, where its simplicity and custom-defined tags make it usable as a semantics-preserving data exchange format. However, to realize this potential, it is necessary to be able to extract structured data from XML documents and store it in a database, as well as to generate XML documents from data extracted from a da...

Measuring and Evaluating a Design Complexity Metric for XML Schema Documents

The eXtensible Markup Language (XML) has been gaining extraordinary acceptance from many diverse enterprise software companies for their object repositories, data interchange, and development tools. Further, many different domains, organizations and content providers have been publishing and exchanging information via the Internet using XML and standard schemas. Efficient implementation o...

Journal:
  • Inf. Process. Lett.

Volume 85

Publication date: 2003